由于机器学习是当前计算材料科学文献中的流行主题,因此为化合物创建表示形式已成为普遍的位置。这些表示形式很少被比较,因为评估了它们的性能 - 与它们一起使用的算法的性能是非平凡的。由于研究过程引起的许多材料数据集含有偏差和偏斜,因此已经引入了一个集群交叉验证(LOCO-CV),以衡量算法在预测以前看不见的材料基团时的性能。这提出了对Loco-CV测量结果的簇大小范围的影响和控制的问题。我们提出了基于组成的表示之间的详尽比较,并研究了如何使用内核近似功能来更好地分开数据以增强Loco-CV应用程序。我们发现,在大多数测试的任务中,域知识并不能提高机器学习的性能,而带隙预测是显着的例外。我们还发现,径向基函数在所有测试的10个数据集中提高了化学数据集的线性可分离性,并为在Loco-CV过程中应用此函数的应用提供了一个框架,以改善机车-CV测量结果,无论机器学习算法如何,选择度量和复合表示的选择。我们建议将内核Loco-CV作为训练范式,以示材料数据上的算法的外推能力。
translated by 谷歌翻译
本文介绍了STC有限公司的描述,该系统提交给NIST 2021扬声器识别评估,用于固定和开放的培训条件。这些系统由许多不同的子系统组成,基于使用深神经网络作为特征提取器。在NIST 2021 SRE挑战期间,我们专注于培训最先进的深部扬声器嵌入式提取器,如Contive角度裕度的损耗功能。此外,通过自动语音识别中的Wav2Vec 2.0特征的最近成功的启发,我们探讨了这种方法对提交的扬声器验证的有效性。根据我们的观察,预先训练的大wave2vec 2.0模型的微调为开放式条件提供了最佳的开展系统。我们对固定条件的WAV2VEC 2.0提取器的实验表明,与对比预测编码损失的无监督自回归预测将打开从原始语音信号训练强大的变压器的提取器。对于视频模型,我们通过RetinaFace面部探测器和深签名脸部嵌入式提取器开发了我们的最佳解决方案,培训了大面孔图像数据集。主要系统的最终结果是通过在分数水平上的不同配置融合的不同配置而获得,然后进行评分校准。
translated by 谷歌翻译
How can we accurately identify new memory workloads while classifying known memory workloads? Verifying DRAM (Dynamic Random Access Memory) using various workloads is an important task to guarantee the quality of DRAM. A crucial component in the process is open-set recognition which aims to detect new workloads not seen in the training phase. Despite its importance, however, existing open-set recognition methods are unsatisfactory in terms of accuracy since they fail to exploit the characteristics of workload sequences. In this paper, we propose Acorn, an accurate open-set recognition method capturing the characteristics of workload sequences. Acorn extracts two types of feature vectors to capture sequential patterns and spatial locality patterns in memory access. Acorn then uses the feature vectors to accurately classify a subsequence into one of the known classes or identify it as the unknown class. Experiments show that Acorn achieves state-of-the-art accuracy, giving up to 37% points higher unknown class detection accuracy while achieving comparable known class classification accuracy than existing methods.
translated by 谷歌翻译
We present a neural technique for learning to select a local sub-region around a point which can be used for mesh parameterization. The motivation for our framework is driven by interactive workflows used for decaling, texturing, or painting on surfaces. Our key idea is to incorporate segmentation probabilities as weights of a classical parameterization method, implemented as a novel differentiable parameterization layer within a neural network framework. We train a segmentation network to select 3D regions that are parameterized into 2D and penalized by the resulting distortion, giving rise to segmentations which are distortion-aware. Following training, a user can use our system to interactively select a point on the mesh and obtain a large, meaningful region around the selection which induces a low-distortion parameterization. Our code and project page are currently available.
translated by 谷歌翻译
FSS(Few-shot segmentation)~aims to segment a target class with a small number of labeled images (support Set). To extract information relevant to target class, a dominant approach in best performing FSS baselines removes background features using support mask. We observe that this support mask presents an information bottleneck in several challenging FSS cases e.g., for small targets and/or inaccurate target boundaries. To this end, we present a novel method (MSI), which maximizes the support-set information by exploiting two complementary source of features in generating super correlation maps. We validate the effectiveness of our approach by instantiating it into three recent and strong FSS baselines. Experimental results on several publicly available FSS benchmarks show that our proposed method consistently improves the performance by visible margins and allows faster convergence. Our codes and models will be publicly released.
translated by 谷歌翻译
Weakly-supervised object detection (WSOD) models attempt to leverage image-level annotations in lieu of accurate but costly-to-obtain object localization labels. This oftentimes leads to substandard object detection and localization at inference time. To tackle this issue, we propose D2DF2WOD, a Dual-Domain Fully-to-Weakly Supervised Object Detection framework that leverages synthetic data, annotated with precise object localization, to supplement a natural image target domain, where only image-level labels are available. In its warm-up domain adaptation stage, the model learns a fully-supervised object detector (FSOD) to improve the precision of the object proposals in the target domain, and at the same time learns target-domain-specific and detection-aware proposal features. In its main WSOD stage, a WSOD model is specifically tuned to the target domain. The feature extractor and the object proposal generator of the WSOD model are built upon the fine-tuned FSOD model. We test D2DF2WOD on five dual-domain image benchmarks. The results show that our method results in consistently improved object detection and localization compared with state-of-the-art methods.
translated by 谷歌翻译
In order to assist the drug discovery/development process, pharmaceutical companies often apply biomedical NER and linking techniques over internal and public corpora. Decades of study of the field of BioNLP has produced a plethora of algorithms, systems and datasets. However, our experience has been that no single open source system meets all the requirements of a modern pharmaceutical company. In this work, we describe these requirements according to our experience of the industry, and present Kazu, a highly extensible, scalable open source framework designed to support BioNLP for the pharmaceutical sector. Kazu is a built around a computationally efficient version of the BERN2 NER model (TinyBERN2), and subsequently wraps several other BioNLP technologies into one coherent system. KAZU framework is open-sourced: https://github.com/AstraZeneca/KAZU
translated by 谷歌翻译
Denoising diffusion models (DDMs) have led to staggering performance leaps in image generation, editing and restoration. However, existing DDMs use very large datasets for training. Here, we introduce a framework for training a DDM on a single image. Our method, which we coin SinDDM, learns the internal statistics of the training image by using a multi-scale diffusion process. To drive the reverse diffusion process, we use a fully-convolutional light-weight denoiser, which is conditioned on both the noise level and the scale. This architecture allows generating samples of arbitrary dimensions, in a coarse-to-fine manner. As we illustrate, SinDDM generates diverse high-quality samples, and is applicable in a wide array of tasks, including style transfer and harmonization. Furthermore, it can be easily guided by external supervision. Particularly, we demonstrate text-guided generation from a single image using a pre-trained CLIP model.
translated by 谷歌翻译
The 1$^{\text{st}}$ Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicle (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi.
translated by 谷歌翻译
Occluded person re-identification (ReID) is a person retrieval task which aims at matching occluded person images with holistic ones. For addressing occluded ReID, part-based methods have been shown beneficial as they offer fine-grained information and are well suited to represent partially visible human bodies. However, training a part-based model is a challenging task for two reasons. Firstly, individual body part appearance is not as discriminative as global appearance (two distinct IDs might have the same local appearance), this means standard ReID training objectives using identity labels are not adapted to local feature learning. Secondly, ReID datasets are not provided with human topographical annotations. In this work, we propose BPBreID, a body part-based ReID model for solving the above issues. We first design two modules for predicting body part attention maps and producing body part-based features of the ReID target. We then propose GiLt, a novel training scheme for learning part-based representations that is robust to occlusions and non-discriminative local appearance. Extensive experiments on popular holistic and occluded datasets show the effectiveness of our proposed method, which outperforms state-of-the-art methods by 0.7% mAP and 5.6% rank-1 accuracy on the challenging Occluded-Duke dataset. Our code is available at https://github.com/VlSomers/bpbreid.
translated by 谷歌翻译